MonolayerCultures / src / Oligodendroglia / [RC17]Monocle_for_Oligodendroglia.Rmd
[RC17]Monocle_for_Oligodendroglia.Rmd
Raw
---
title: "[RC17]Pseudotime_Analysis for Oligodendroglia"
author: "Nina-Lydia Kazakou"
date: "17 March 2022"
output: html_document
---

Pseudotime is a measure of how much progress an individual cell has made through a process such as cell differentiation.

In many biological processes, cells do not progress in perfect synchrony. In single-cell expression studies of processes such as cell differentiation, captured cells might be widely distributed in terms of progress. That is, in a population of cells captured at exactly the same time, some cells might be far along, while others might not yet even have begun the process. This asynchrony creates major problems when you want to understand the sequence of regulatory changes that occur as cells transition from one state to the next. Tracking the expression across cells captured at the same time produces a very compressed sense of a gene's kinetics, and the apparent variability of that gene's expression will be very high.

By ordering each cell according to its progress along a learned trajectory, Monocle alleviates the problems that arise due to asynchrony. Instead of tracking changes in expression as a function of time, Monocle tracks changes as a function of progress along the trajectory, which we term "pseudotime". Pseudotime is an abstract unit of progress: it's simply the distance between a cell and the start of the trajectory, measured along the shortest path. The trajectory's total length is defined in terms of the total amount of transcriptional change that a cell undergoes as it moves from the starting state to the end state.

# Set-up

### Load libraries 
```{r message=FALSE, warning=FALSE}
library(Seurat)
library(dplyr)
library(monocle)
library(here)
library(ggsci)
```

### Colour Palette

```{r load_palette}
mypal <- pal_npg("nrc", alpha = 0.7)(10)
mypal2 <-pal_tron("legacy", alpha = 0.7)(7)
mypal3<-pal_lancet("lanonc", alpha = 0.7)(9)
mypal4<-pal_simpsons(palette = c("springfield"), alpha = 0.7)(16)
mypal5 <- pal_rickandmorty(palette = c("schwifty"), alpha = 0.7)(6)
mypal6 <- pal_futurama(palette = c("planetexpress"), alpha = 0.7)(5)
mypal7 <- pal_startrek(palette = c("uniform"), alpha = 0.7)(5)
mycoloursP<- c(mypal, mypal2, mypal3, mypal4)
```

#### Load Seurat Object
```{r}
oligos <- readRDS(here("data", "Processed", "Oligodendroglia", "hES-derived_monolayerOL.rds"))
Idents(oligos) <- "ClusterID"
```


# Monocle

### Extract data, phenotype data and feature data from the rt
```{r Extract_Data, eval=FALSE}
oligos_data <- as(as.matrix(oligos@assays$RNA@data), 'sparseMatrix')

pd <- new('AnnotatedDataFrame', data = oligos@meta.data)

fData <- data.frame(gene_short_name = row.names(oligos_data), row.names = row.names(oligos_data))
fd <- new('AnnotatedDataFrame', data = fData)

```

```{r eval=FALSE}
dir.create(here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis"))
dir.create(here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle"))
```

### Construct monocle cds
```{r Construct_cds, eval=FALSE}
oligos_cds <- newCellDataSet(oligos_data,
                         phenoData = pd,
                         featureData = fd,
                         lowerDetectionLimit = 0.5,
                         expressionFamily = negbinomial.size())

View(pData(oligos_cds)) 

# OPTIONAL: Pick only the ones we want (for example if we use the full object, like the RC17)
#pData(mouse_cds)$Oligos <- pData(mouse_cds)$seurat_clusters == "0" | pData(mouse_cds)$Oligos == "1" | pData(mouse_cds)$seurat_clusters == "2" | #pData(mouse_cds)$seurat_clusters == "8" | pData(mouse_cds)$seurat_clusters == "4" |pData(mouse_cds)$seurat_clusters == "5" 
#mouse_cds_OL <- mouse_cds[, pData(mouse_cds)$Oligos]

oligos_cds <- estimateSizeFactors(oligos_cds)
oligos_cds <- estimateDispersions(oligos_cds)

clustering_oligos <- differentialGeneTest(oligos_cds,
                                              fullModelFormulaStr = '~ClusterID',
                                              cores = 6, verbose = T)

# select genes that are significant at an FDR <10%
genes_oligos_cds <- subset(clustering_oligos, qval < 0.1) 
    
# my_ordering_genes <- row.names(clustering_DEG_genes)[order(clustering_DEG_genes$qval)][1:1000]
oligos_cds <- setOrderingFilter(oligos_cds, ordering_genes = genes_oligos_cds)
oligos_cds <- reduceDimension(oligos_cds, method = 'DDRTree', cores=6, verbose = TRUE)
oligos_cds <- orderCells(oligos_cds)

my_pseudotime_de <- differentialGeneTest(oligos_cds,
                                             fullModelFormulaStr = "~sm.ns(Pseudotime)",
                                             cores = 6)

saveRDS(oligos_cds, here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "oligos_cds.rds"))
saveRDS(my_pseudotime_de, here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "my_pseudotime_de.rds"))
```

```{r}
oligos_cds <- readRDS(here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "oligos_cds.rds"))
my_pseudotime_de <- readRDS(here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "my_pseudotime_de.rds"))
```


# Plots
```{r fig.height=10, fig.width=12}
plot_cell_trajectory(oligos_cds, color_by = "ClusterID") + facet_wrap(~State) + 
  scale_colour_manual(values=mycoloursP[6:40])

plot_cell_trajectory(oligos_cds, color_by = "ClusterID") + 
  facet_wrap(~ClusterID) +
  scale_colour_manual(values=mycoloursP[6:40])
```

```{r fig.height=7, fig.width=8}
plot_cell_trajectory(oligos_cds, color_by = "State")  + 
  scale_colour_manual(values=mycoloursP[6:40])

#plot_cell_trajectory(oligos_cdsoligos_cds, color_by = "Sample") + facet_wrap(~Sample)
plot_cell_trajectory(oligos_cds, color_by = "Phase", cell_size = 0.2)+
  scale_colour_manual(values=mycoloursP[21:40])

plot_cell_trajectory(oligos_cds, color_by = "Pseudotime")

plot_cell_trajectory(oligos_cds, color_by = "ClusterID")+
  scale_colour_manual(values=mycoloursP[6:40])
```

```{r eval=FALSE}
plot_1 <- plot_cell_trajectory(oligos_cds, color_by = "ClusterID") + facet_wrap(~State) + 
  scale_colour_manual(values=mycoloursP[6:40])
plot_2 <- plot_cell_trajectory(oligos_cds, color_by = "ClusterID") + 
  facet_wrap(~ClusterID) +
  scale_colour_manual(values=mycoloursP[6:40])

plot_3 <- plot_cell_trajectory(oligos_cds, color_by = "State")  + 
  scale_colour_manual(values=mycoloursP[6:40])
plot_4 <- plot_cell_trajectory(oligos_cds, color_by = "Phase", cell_size = 0.2)+
  scale_colour_manual(values=mycoloursP[21:40])
plot_5 <- plot_cell_trajectory(oligos_cds, color_by = "Pseudotime")
plot_6 <- plot_cell_trajectory(oligos_cds, color_by = "ClusterID")+
  scale_colour_manual(values=mycoloursP[6:40])
```

```{r eval=FALSE}
pdf(file = here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "ClusterID_State.pdf"),   
              width = 12, 
              height = 10) 
print(plot_1)
dev.off()

pdf(file = here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "ClusterID_split.pdf"),   
              width = 12, 
              height = 10) 
print(plot_2)
dev.off()


pdf(file = here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "State.pdf"),   
              width = 8, 
              height = 7) 
print(plot_3)
dev.off()

pdf(file = here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "Phase.pdf"),   
              width = 8, 
              height = 7) 
print(plot_4)
dev.off()

pdf(file = here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "Pseudotime.pdf"),   
              width = 8, 
              height = 7) 
print(plot_5)
dev.off()

pdf(file = here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "ClusterID.pdf"),   
              width = 8, 
              height = 7) 
print(plot_6)
dev.off()

```


```{r eval=FALSE}
head(pData(oligos_cds))
#the pseudotime and its state is now in the pData

my_pseudotime_de <- differentialGeneTest(oligos_cds,
                                         fullModelFormulaStr = "~sm.ns(Pseudotime)",
                                         cores = 18)
write.csv(my_pseudotime_de, here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "Pseudo-time_DE.csv"))
```

```{r}
my_pseudotime_de <- readRDS(here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "my_pseudotime_de.rds"))
```


```{r fig.height=8, fig.width=6}
# select the 10 highest genes and use them for plotting
top_genes <- my_pseudotime_de %>% arrange(qval) %>% head(10)

top_genes

write.csv(top_genes, here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "Pseudo-time_DE_Top10.csv"))

plot_genes_in_pseudotime(oligos_cds[rownames(top_genes)])

plot_genes_in_pseudotime(oligos_cds[rownames(top_genes)], color_by = "ClusterID") +  
  scale_colour_manual(values=mycoloursP[6:40])
```

```{r eval=FALSE}
plot_7 <- plot_genes_in_pseudotime(oligos_cds[rownames(top_genes)], color_by = "ClusterID")+  
  scale_colour_manual(values=mycoloursP[6:40])

pdf(file = here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "ClusterID_Top10_DE.pdf"),   
              width = 12, 
              height = 10) 
print(plot_7)
dev.off()
```

```{r fig.height=10, fig.width=6}
plot_genes_in_pseudotime(oligos_cds[c("OLIG2", "SOX10", "PDGFRA", "NKX2-2", "MYC", "EGFR", "DLL1", "HOPX", "SPARCL1", "MBP", "MAG")], 
                         color_by = "ClusterID") +  
  scale_colour_manual(values=mycoloursP[6:40])
```


```{r eval=FALSE}
plot_8 <- plot_genes_in_pseudotime(oligos_cds[c("OLIG2", "SOX10", "PDGFRA", "NKX2-2", "MYC", "EGFR", "DLL1", "HOPX", "SPARCL1", "MBP", "MAG")], 
                         color_by = "ClusterID") +  
  scale_colour_manual(values=mycoloursP[6:40])

pdf(file = here("outs", "Processed", "Oligodendroglia", "Pseudo-time_Analysis", "Monocle", "Cluster_Markers.pdf"),   
              width = 12, 
              height = 10) 
print(plot_8)
dev.off()
```


```{r}
sessionInfo()
```